Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model
Nijkamp, Erik; Hill, Mitch; Zhu, Song-Chun; Wu, Ying Nian
This paper studies a curious phenomenon in learning an energy-based model (EBM) using MCMC. In each learning iteration, we generate synthesized examples by running a non-convergent, non-mixing, and non-persistent short-run MCMC toward the current model, always starting from the same initial distribution, such as a uniform noise distribution, and always running a fixed number of MCMC steps. After generating the synthesized examples, we update the model parameters according to the maximum likelihood learning gradient, as if the synthesized examples were fair samples from the current model. We treat this non-convergent short-run MCMC as a learned generator model or flow model, and we provide arguments for treating it as a valid model. We show that the learned short-run MCMC is capable of generating realistic images. More interestingly, unlike a traditional EBM or MCMC, the learned short-run MCMC can also reconstruct observed images and interpolate between images, like a generator or flow model. The code can be found in the Appendix.
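To make the learning loop concrete, the following is a minimal runnable sketch in PyTorch. It is an illustration under stated assumptions, not the paper's released code: the network architecture, 32x32 input size, step size, number of steps K, uniform initialization range, and optimizer settings are all illustrative choices, and we write the model as p_theta(x) proportional to exp(f_theta(x)), so the Langevin updates ascend f_theta.

# Minimal sketch of short-run MCMC learning of an EBM (illustrative only).
# Assumes p_theta(x) ~ exp(f_theta(x)) and 32x32 RGB images scaled to [-1, 1].
import torch
import torch.nn as nn

class Score(nn.Module):
    # Small ConvNet computing the scalar f_theta(x) for a batch of images.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.SiLU(),
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, 1))  # assumes 32x32 inputs
    def forward(self, x):
        return self.net(x).squeeze(-1)

def short_run_langevin(f, x, K=100, step=0.01):
    # Non-persistent short-run chain: always a fixed number K of Langevin
    # steps, always starting from the freshly drawn initialization x.
    for _ in range(K):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(f(x).sum(), x)[0]
        x = x + 0.5 * step ** 2 * grad + step * torch.randn_like(x)
    return x.detach()

f = Score()
opt = torch.optim.Adam(f.parameters(), lr=1e-4)
for x_data in loader:  # `loader` (assumed) yields batches of real images
    x_init = 2 * torch.rand_like(x_data) - 1  # fixed uniform noise init
    x_syn = short_run_langevin(f, x_init)     # synthesized examples
    # Maximum-likelihood-style update, treating x_syn as if they were
    # fair samples from the current model.
    loss = f(x_syn).mean() - f(x_data).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()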
Reply to Reviewer 2: Thank you for the insightful and comprehensive summary of our work. We will add this information in the revision. We can still learn the short-run MCMC successfully. We shall also try to implement the method of Burda et al. (thank you for the reference). Following your advice, we conducted 1D and 2D experiments.
Reviews: Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model
The highlighted phenomenon (the convergence of a short-run MCMC while training EBMs) seems novel and very interesting. The conventional wisdom is that a simple MCMC algorithm like Langevin dynamics takes a long time to converge to the stationary distribution of the EBM when initialized far from it. The paper argues that if the EBM is instead trained by generating negative samples from a short-run MCMC, then the short-run chain converges close to the data distribution (the authors argue that this "closeness" is related to moment matching). The theoretical argument offered to explain this phenomenon is suggestive but ultimately did not convince the reviewer: even the convergence of the algorithm does not seem to be explained, and Section 4.2 seems particularly weak, as it is not clear what the "generalized moment matching objective" is trying to achieve. However, the empirical evidence for the convergence of short-run MCMC in EBMs is very compelling: the training procedure is significantly simpler than other procedures used to train EBMs, yet it produces highly competitive results on several image datasets.
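For context, the moment-matching reading follows from the standard form of the maximum likelihood gradient for an EBM. Writing p_theta(x) = exp(f_theta(x)) / Z(theta), a sketch of the identity (notation assumed here, not taken verbatim from the paper):

\nabla_\theta \, \mathbb{E}_{p_{\rm data}}\!\left[\log p_\theta(x)\right]
= \mathbb{E}_{p_{\rm data}}\!\left[\nabla_\theta f_\theta(x)\right]
- \mathbb{E}_{p_\theta}\!\left[\nabla_\theta f_\theta(x)\right].

When the expectation under p_theta is replaced by an expectation under the short-run distribution q_theta, a fixed point of the parameter updates instead satisfies \mathbb{E}_{p_{\rm data}}[\nabla_\theta f_\theta(x)] = \mathbb{E}_{q_\theta}[\nabla_\theta f_\theta(x)], i.e., q_theta matches the data on the statistics \nabla_\theta f_\theta. This is the sense in which "closeness" is tied to moment matching.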
A new energy-based generative model for images is proposed. The paper suggests running Langevin dynamics in the data domain to create synthetic samples, and updating the model parameters based on these synthesized images in an 'analysis by synthesis' framework. The generative model allows for unconditional generation and interpolation. It is interesting that short-run MCMC can be used in this context despite not having converged. The effect of the hyperparameter K (the number of MCMC steps) could have been explored more thoroughly.
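For reference, the Langevin update in the data domain referred to here takes the standard form (the step size s and the convention p_theta(x) proportional to exp(f_theta(x)) are assumptions of this sketch):

x_{k+1} = x_k + \frac{s^2}{2} \nabla_x f_\theta(x_k) + s \, \epsilon_k,
\qquad \epsilon_k \sim \mathcal{N}(0, I), \quad k = 0, \dots, K-1,

with x_0 drawn from the fixed initial noise distribution and K held constant throughout training.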